Refinement of Document Clustering by Using NMF
Abstract
In this paper, we use non-negative matrix factorization (NMF) to refine document clustering results. NMF is a dimensionality reduction method that is effective for document clustering because a term-document matrix is high-dimensional and sparse. The initial matrix of the NMF algorithm can be regarded as a clustering result, so NMF can be used as a refinement method. We first perform min-max cut (Mcut), a powerful spectral clustering method, and then refine its result via NMF, which should in principle yield an accurate clustering result. However, NMF often fails to improve the given clustering result. To overcome this problem, we use the Mcut objective function to stop the NMF iteration.
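The refinement loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the initial labels are assumed to come from some earlier clustering step (such as Mcut, which is not implemented here), document similarity is taken to be cosine similarity, the NMF uses standard Lee-Seung multiplicative updates, and the stopping rule simply halts as soon as the Mcut objective gets worse. The function names `cosine_similarity`, `mcut_objective`, and `refine_with_nmf`, and the initialization constants, are hypothetical.

```python
import numpy as np


def cosine_similarity(X, eps=1e-9):
    """Document-by-document cosine similarity; columns of X are documents."""
    Xn = X / (np.linalg.norm(X, axis=0) + eps)
    return Xn.T @ Xn


def mcut_objective(S, labels, k):
    """Sum over clusters of cut(C, rest) / within-cluster similarity.

    Self-similarities on the diagonal of S are counted as within-cluster
    weight (an assumption of this sketch). Returns inf for an empty cluster.
    """
    total = 0.0
    for c in range(k):
        inside = labels == c
        within = S[np.ix_(inside, inside)].sum()
        cut = S[np.ix_(inside, ~inside)].sum()
        if within <= 0:
            return np.inf
        total += cut / within
    return total


def refine_with_nmf(X, init_labels, k, max_iter=50, eps=1e-9):
    """Refine a clustering with NMF, stopping when the Mcut objective worsens.

    X is a non-negative term-document matrix (terms x documents).
    Returns the best labels seen and their Mcut objective value.
    """
    X = np.asarray(X, dtype=float)
    init_labels = np.asarray(init_labels)
    n_docs = X.shape[1]
    S = cosine_similarity(X, eps)

    # Encode the given clustering as the initial factor V (documents x k).
    # Off-cluster entries are small but nonzero, because multiplicative
    # updates can never move an entry away from exactly zero.
    V = np.full((n_docs, k), 0.1)
    V[np.arange(n_docs), init_labels] = 1.0
    U = X @ V + eps  # initial basis: weighted sums of each cluster's documents

    best_labels = init_labels.copy()
    best_score = mcut_objective(S, best_labels, k)

    for _ in range(max_iter):
        # Lee-Seung multiplicative updates for X ~ U V^T (Frobenius loss).
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)

        labels = V.argmax(axis=1)
        score = mcut_objective(S, labels, k)
        if score > best_score:
            break  # NMF made the clustering worse by the Mcut criterion
        best_labels, best_score = labels, score

    return best_labels, best_score
```

By construction, the returned clustering never has a higher Mcut objective than the one passed in, which is the property the stopping rule is meant to provide. Whether the refined labels are actually more accurate depends on the data.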
Related resources
Ping-pong Document Clustering using NMF and Linkage-Based Refinement
This paper proposes a ping-pong document clustering method that alternates between NMF and linkage-based refinement in order to improve the clustering result of NMF. The use of NMF in the ping-pong strategy can be expected to be effective for document clustering. However, NMF in the ping-pong strategy often worsens performance because NMF often fails to improve the clustering result given as the initi...
Cluster-based language model for spoken document retrieval using NMF-based document clustering
In this paper, a non-negative matrix factorization (NMF)-based document clustering approach is proposed for the cluster-based language model for spoken document retrieval. The retrieval language model comprises three different unigram models: a whole corpus collect-based unigram, a document-based unigram, and a document clustering-based unigram. They are combined with double linear interpolations. ...
Ensemble document clustering using weighted hypergraph generated by NMF
In this paper, we propose a new ensemble document clustering method. The novelty of our method is the use of Non-negative Matrix Factorization (NMF) in the generation phase and a weighted hypergraph in the integration phase. In our experiment, we compared our method with some clustering methods. Our method achieved the best results.
Nonnegative Matrix Factorization with Orthogonality Constraints
Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data, the goal of which is to decompose a data matrix into a product of two factor matrices with all entries in factor matrices restricted to be nonnegative. NMF was shown to be useful in a task of clustering (especially document clustering), but in some cases NMF produces the results inappropria...
Subtractive Initialization of Nonnegative Matrix Factorizations for Document Clustering
Nonnegative matrix factorizations (NMF) have recently assumed an important role in several fields, such as pattern recognition, automated image exploitation, data clustering and so on. They represent a peculiar tool adopted to obtain a reduced representation of multivariate data by using additive components only, in order to learn parts-based representations of data. All algorithms for computin...